A Thermodynamic Interpretation of Time for Superstring Rolling Tachyons
Rolling tachyon backgrounds, arising from open strings on unstable branes in
bosonic string theory, can be related to a simple statistical mechanical model:
a Coulomb gas of point charges in two dimensions confined to a circle, the
Dyson gas. In this letter we describe a statistical system that is dual to
non-BPS branes in superstring theory. We argue that even though the concept of
time is absent in the statistical dual sitting at equilibrium, the notion of
time can emerge in the limit of a large number of particles.
Comment: 6 pages, 3 figures, v2: reference added, v3: minor clarification,
version to appear in journa
Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech
We describe a statistical approach for modeling dialogue acts in
conversational speech, i.e., speech-act-like units such as Statement, Question,
Backchannel, Agreement, Disagreement, and Apology. Our model detects and
predicts dialogue acts based on lexical, collocational, and prosodic cues, as
well as on the discourse coherence of the dialogue act sequence. The dialogue
model is based on treating the discourse structure of a conversation as a
hidden Markov model and the individual dialogue acts as observations emanating
from the model states. Constraints on the likely sequence of dialogue acts are
modeled via a dialogue act n-gram. The statistical dialogue grammar is combined
with word n-grams, decision trees, and neural networks modeling the
idiosyncratic lexical and prosodic manifestations of each dialogue act. We
develop a probabilistic integration of speech recognition with dialogue
modeling, to improve both speech recognition and dialogue act classification
accuracy. Models are trained and evaluated using a large hand-labeled database
of 1,155 conversations from the Switchboard corpus of spontaneous
human-to-human telephone speech. We achieved good dialogue act labeling
accuracy (65% based on errorful, automatically recognized words and prosody,
and 71% based on word transcripts, compared to a chance baseline accuracy of
35% and human accuracy of 84%) and a small reduction in word recognition error.
Comment: 35 pages, 5 figures. Changes in copy editing (note title spelling
changed
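As a rough illustration of the sequence model described in the abstract above, the following minimal sketch decodes the most likely dialogue act sequence with the Viterbi algorithm. It assumes that dialogue act labels are the hidden states, that a bigram supplies the transition probabilities, and that each utterance contributes a per-label likelihood. The function name `viterbi`, the two-label inventory, and all probabilities are hypothetical toy values, not figures from the paper.

```python
import math

def viterbi(obs_loglik, log_trans, log_init):
    """Return the most likely label sequence.

    obs_loglik: list of dicts, one per utterance: {label: log P(evidence | label)}
    log_trans:  dict {(prev_label, label): log P(label | prev_label)}  (bigram)
    log_init:   dict {label: log P(label)} for the first utterance
    """
    states = list(log_init)
    # best[t][s] is the best log score of any path that ends in state s at time t
    best = [{s: log_init[s] + obs_loglik[0][s] for s in states}]
    back = []  # back[t-1][s] is the best predecessor of s at time t
    for t in range(1, len(obs_loglik)):
        scores, pointers = {}, {}
        for cur in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[(p, cur)])
            scores[cur] = best[-1][prev] + log_trans[(prev, cur)] + obs_loglik[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

# Toy example (all numbers are made up for illustration)
log = math.log
log_init = {"Statement": log(0.6), "Question": log(0.4)}
log_trans = {
    ("Statement", "Statement"): log(0.6), ("Statement", "Question"): log(0.4),
    ("Question", "Statement"): log(0.8), ("Question", "Question"): log(0.2),
}
obs_loglik = [
    {"Statement": log(0.2), "Question": log(0.8)},
    {"Statement": log(0.5), "Question": log(0.5)},  # ambiguous utterance
    {"Statement": log(0.9), "Question": log(0.1)},
]
path = viterbi(obs_loglik, log_trans, log_init)
print(path)
```

In this toy run the second utterance is ambiguous on its own evidence, so the bigram constraint (a question is usually followed by a statement) decides its label.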
Can Prosody Aid the Automatic Classification of Dialog Acts in Conversational Speech?
Identifying whether an utterance is a statement, question, greeting, and so forth is integral to effective automatic understanding of natural dialog. Little is known, however, about how such dialog acts (DAs) can be automatically classified in truly natural conversation. This study asks whether current approaches, which use mainly word information, could be improved by adding prosodic information. The study examines over 1000 conversations from the Switchboard corpus. DAs were hand-annotated, and prosodic features (duration, pause, F0, energy and speaking-rate features) were automatically extracted for each DA. In training, decision trees based on these features were inferred; trees were then applied to unseen test data to evaluate performance. For an all-way classification as well as three subtasks, prosody allowed highly significant classification over chance. Feature-specific analyses further revealed that although canonical features (such as F0 for questions) were important, less obvious features could compensate if canonical features were removed. Finally, in each task, integrating the prosodic model with a DA-specific statistical language model improved performance over that of the language model alone. Results suggest that DAs are redundantly marked in natural conversation, and that a variety of automatically extractable prosodic features could aid dialog processing in speech applications.
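One common way to integrate a prosodic classifier with a language model is to add their log scores per dialog act. The sketch below shows that idea under stated assumptions: the prosodic model outputs a posterior P(DA | prosody), which is divided by the prior to obtain a scaled likelihood, and words and prosody are treated as independent given the DA. The abstract does not spell out the combination rule, so the function `combine_scores`, the `weight` parameter, and all numbers are hypothetical.

```python
import math

def combine_scores(lm_loglik, prosody_post, prior, weight=1.0):
    """Combine word and prosody evidence into a posterior over dialog acts.

    lm_loglik:    {da: log P(words | da)} from DA-specific language models
    prosody_post: {da: P(da | prosody)} from a prosodic classifier
    prior:        {da: P(da)}
    weight:       assumed tuning factor on the prosodic evidence
    """
    scores = {}
    for da in prior:
        # Posterior divided by prior is proportional to P(prosody | da)
        prosody_scaled = math.log(prosody_post[da]) - math.log(prior[da])
        scores[da] = lm_loglik[da] + weight * prosody_scaled + math.log(prior[da])
    # Normalize in log space for numerical stability
    top = max(scores.values())
    log_z = top + math.log(sum(math.exp(v - top) for v in scores.values()))
    return {da: math.exp(v - log_z) for da, v in scores.items()}

# Toy example: the words are uninformative, so prosody decides the label
prior = {"Statement": 0.7, "Question": 0.3}
lm_loglik = {"Statement": math.log(0.5), "Question": math.log(0.5)}
prosody_post = {"Statement": 0.4, "Question": 0.6}
posterior = combine_scores(lm_loglik, prosody_post, prior)
best_da = max(posterior, key=posterior.get)
print(best_da, posterior)
```

Setting `weight` to 0 reduces the result to the language model plus the prior, which gives a simple baseline for checking whether the prosodic evidence helps.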
Automatic detection of discourse structure for speech recognition and understanding.
We describe a new approach for statistical modeling and detection of discourse structure
for natural conversational speech. Our model is based on 42 ‘Dialog Acts’ (DAs)
(question, answer, backchannel, agreement, disagreement, apology, etc.). We labeled
1155 conversations from the Switchboard (SWBD) database (Godfrey et al. 1992) of
human-to-human telephone conversations with these 42 types and trained a Dialog Act
detector based on three distinct knowledge sources: sequences of words which characterize
a dialog act, prosodic features which characterize a dialog act, and a statistical
Discourse Grammar. Our combined detector, although still in preliminary stages, already
achieves a 65% Dialog Act detection rate based on acoustic waveforms, and 72%
accuracy based on word transcripts. Using this detector to switch among the 42
Dialog-Act-Specific trigram LMs also gave us an encouraging but not statistically significant
reduction in SWBD word error
Toward a Plan-Based Understanding Model for Mixed-Initiative Dialogues
This paper presents an enhanced model of plan-based dialogue understanding. Most plan-based dialogue understanding models derived from [Litman and Allen, 1987] assume that the dialogue speakers have access to the same domain plan library, and that the active domain plans are shared by the two speakers. We call these features shared domain plan constraints. These assumptions, however, are too strict to account for mixed-initiative dialogues where each speaker has a different set of domain plans that are housed in his or her own plan library, and where an individual speaker's domain plans may be activated at any point in the dialogue.
Linguistically Engineered Tools for Speech Recognition Error Analysis
In order to improve Large Vocabulary Continuous Speech Recognition (LVCSR) systems, it is essential to discover exactly how our current systems are underperforming. The major intellectual tool for solving this problem is error analysis: careful investigation of just which factors are contributing to errors in the recognizers. This paper presents our observations of the effects that discourse (i.e., dialog) modeling has on LVCSR system performance. As our title indicates, we emphasize the recognition error analysis methodology we developed and what it showed us as opposed to emphasizing development of the discourse model itself. In the first analysis of our output data, we focussed on errors that could be eliminated by Dialog Act discourse tagging [JSB97] using Dialog Act-specific language models. In a second analysis, we manipulated the parameterization of the Dialog Act-specific language models, enabling us to acquire evidence of the constraints these models introduced. The word error ..
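For readers unfamiliar with the metric behind recognition error analysis, word error rate is conventionally the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below is a generic illustration of that definition, not the analysis tooling described in the abstract. The function name `word_error_rate` and the example sentences are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix processed
    # so far and the first j hypothesis words
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,                         # deletion
                cur[j - 1] + 1,                      # insertion
                prev[j - 1] + (ref_word != hyp_word) # substitution or match
            ))
        prev = cur
    return prev[-1] / len(ref)

# One deletion ("do") and one substitution ("it" -> "that") over 4 reference words
wer = word_error_rate("do you like it", "you like that")
print(wer)
```

Because insertions count as errors, this ratio can exceed 1.0 when the hypothesis is much longer than the reference.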